IBM DataStage Transformers

A step-by-step guide to starting your journey on Dadosfera

DataStage Transformers

This section documents all deterministic transformers for IBM DataStage stages.


Available Transformers

  1. CopyStageTransformer - Dataset and stage reads with column filtering
  2. LookupStageTransformer - Multi-table JOINs with view mappings
  3. ModifyStageTransformer - Column renames, keep, and drop operations
  4. JoinStageTransformer - SQL JOINs (LEFT, INNER, RIGHT, FULL)
  5. AggregatorStageTransformer - GROUP BY with aggregations
  6. RemoveDuplicatesTransformer - DISTINCT and ROW_NUMBER() deduplication
  7. FunnelStageTransformer - UNION ALL for multiple inputs
  8. InputStageTransformer - Oracle SQL conversion and table reads
  9. ImportStageTransformer - CSV/file reads mapped to tables
  10. OutputOracleTransformer - SELECT passthrough for output stages
  11. TransformerStageTransformer - Column derivations (hybrid with LLM)
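To make the SQL shapes named in the list concrete, here is a minimal Python sketch of what a UNION ALL funnel and a ROW_NUMBER() deduplication could look like. The function names `funnel_sql` and `dedup_sql` are hypothetical and not part of the documented transformers, and the use of Snowflake's QUALIFY clause is an assumption about one valid way to express the deduplication, not a statement of what RemoveDuplicatesTransformer actually emits.

```python
def funnel_sql(inputs):
    """Combine several upstream views with UNION ALL (hypothetical helper)."""
    if not inputs:
        raise ValueError("a funnel needs at least one input")
    return "\nUNION ALL\n".join(f"SELECT * FROM {name}" for name in inputs)


def dedup_sql(source, keys, order_by):
    """Keep one row per key using ROW_NUMBER() (hypothetical helper).

    Assumes Snowflake's QUALIFY clause; the real transformer may instead
    wrap the window function in a subquery.
    """
    partition = ", ".join(keys)
    return (
        f"SELECT * FROM {source}\n"
        f"QUALIFY ROW_NUMBER() OVER (PARTITION BY {partition} ORDER BY {order_by}) = 1"
    )


if __name__ == "__main__":
    print(funnel_sql(["stage_a", "stage_b"]))
    print(dedup_sql("orders", ["customer_id"], "updated_at DESC"))
```

The identifiers are interpolated without quoting here, so the sketch only suits trusted, already-validated names.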

Stage Type Mapping

| DataStage Stage Type | Transformer | SQL Output |
|---|---|---|
| COPY | CopyStageTransformer | SELECT with column filtering |
| LOOKUP | LookupStageTransformer | SELECT with multiple JOINs |
| MODIFY | ModifyStageTransformer | SELECT with renames/filters |
| JOIN | JoinStageTransformer | SELECT with JOIN |
| AGGREGATOR | AggregatorStageTransformer | SELECT with GROUP BY |
| REMOVE_DUPLICATES | RemoveDuplicatesTransformer | SELECT with DISTINCT or ROW_NUMBER() |
| FUNNEL | FunnelStageTransformer | UNION ALL |
| INPUT | InputStageTransformer | Oracle SQL → Snowflake SQL |
| IMPORT | ImportStageTransformer | SELECT from mapped table |
| OUTPUT_ORACLE | OutputOracleTransformer | SELECT passthrough |
| TRANSFORMER | TransformerStageTransformer | SELECT with derivations |
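The mapping above amounts to a lookup from stage type to transformer. A minimal sketch of that dispatch in Python follows; the dictionary contents come from the table, while the names `STAGE_TRANSFORMERS` and `transformer_for` are hypothetical and the real implementation may resolve stage types differently.

```python
# Stage type -> transformer name, copied from the mapping table.
# The registry itself is an illustrative assumption, not the documented API.
STAGE_TRANSFORMERS = {
    "COPY": "CopyStageTransformer",
    "LOOKUP": "LookupStageTransformer",
    "MODIFY": "ModifyStageTransformer",
    "JOIN": "JoinStageTransformer",
    "AGGREGATOR": "AggregatorStageTransformer",
    "REMOVE_DUPLICATES": "RemoveDuplicatesTransformer",
    "FUNNEL": "FunnelStageTransformer",
    "INPUT": "InputStageTransformer",
    "IMPORT": "ImportStageTransformer",
    "OUTPUT_ORACLE": "OutputOracleTransformer",
    "TRANSFORMER": "TransformerStageTransformer",
}


def transformer_for(stage_type: str) -> str:
    """Return the transformer name for a stage type (case-insensitive)."""
    key = stage_type.strip().upper()
    try:
        return STAGE_TRANSFORMERS[key]
    except KeyError:
        raise ValueError(f"no transformer for stage type {stage_type!r}") from None


if __name__ == "__main__":
    print(transformer_for("funnel"))
```

Raising on an unknown stage type, rather than falling back silently, keeps unsupported stages visible during a migration.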

Common Features

All DataStage transformers:

  • Parse .dsx orchestrate code
  • Extract column definitions from modify sections
  • Support parameter substitution ([&"param"], #param#)
  • Generate Snowflake-compatible SQL
  • Build output schemas from stage metadata
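The parameter substitution feature can be illustrated with a short Python sketch that handles the two placeholder forms listed above, `[&"param"]` and `#param#`. The function name `substitute_params`, the regular expression, and the choice to leave unknown parameters untouched are all assumptions made for illustration; the transformers' actual rules may differ.

```python
import re

# Matches [&"name"] or #name#. The allowed characters in the #name# form
# (letters, digits, underscore, dot, dollar sign) are an assumption.
_PARAM_PATTERN = re.compile(
    r'\[&"(?P<bracket>[^"]+)"\]|#(?P<hash>[A-Za-z_$][\w.$]*)#'
)


def substitute_params(text, params):
    """Replace both placeholder forms with values from `params`.

    Placeholders whose name is not in `params` are left as they are.
    """
    def _replace(match):
        name = match.group("bracket") or match.group("hash")
        return str(params.get(name, match.group(0)))

    return _PARAM_PATTERN.sub(_replace, text)


if __name__ == "__main__":
    sql = 'SELECT * FROM #schema#.[&"table"]'
    print(substitute_params(sql, {"schema": "SALES", "table": "ORDERS"}))
```

Leaving unresolved placeholders in the output makes missing parameters easy to spot in the generated SQL.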

Documentation Format

Each transformer is documented with:

  • Overview - What the transformer does
  • Capabilities - Features and supported operations
  • DataStage Stage Example - Real .dsx format input
  • Generated SQL Output - Actual SQL produced
  • Output Schema - Resulting column definitions
  • Limitations - Known issues and unsupported features

Next Steps

Start with CopyStageTransformer to see the documentation format with real examples.